A MapReduce Implementation of C4.5 Decision Tree Algorithm
Similar resources
A MapReduce Implementation of C4.5 Decision Tree Algorithm
Recent years have witnessed the development of cloud computing and the big data era, which bring new challenges to traditional decision tree algorithms. First, as datasets become extremely large, building a decision tree can be quite time-consuming. Second, because the data can no longer fit in memory, some computation must be moved to external storage and therefore...
PDTSSE: A Scalable Parallel Decision Tree Algorithm Based on MapReduce
Parallel decision tree learning is an effective and efficient approach to scaling decision trees to large data mining applications. Aiming at large-scale decision tree learning, we present a novel parallel decision tree learning algorithm in the MapReduce framework, called PDTSSE (Parallel Decision Tree via Sampling Splitting points with Estimation). We first propose an estimation method for samp...
MR-Tree - A Scalable MapReduce Algorithm for Building Decision Trees
Learning decision trees from very large amounts of data is not practical on single-node computers because of the large number of calculations the process requires. Apache Hadoop is a large-scale distributed computing platform that runs on commodity hardware clusters and can be used successfully for data mining tasks on very large datasets. This work presents a parallel decision tree learn...
A Multi-relational Decision Tree Learning Algorithm - Implementation and Experiments
We describe an efficient implementation (MRDTL-2) of the multi-relational decision tree learning (MRDTL) algorithm [19], which in turn was based on a proposal by Knobbe et al. [15]. We describe some simple techniques for speeding up the calculation of sufficient statistics for decision trees and related hypothesis classes from multi-relational data. Because missing values are fairly common in many...
Journal
Journal title: International Journal of Database Theory and Application
Year: 2014
ISSN: 2005-4270
DOI: 10.14257/ijdta.2014.7.1.05